2.2 Stochastic, Approximate and Neural Computing

Authors

  • John Hayes
  • Rishad Shafik
  • Ghaith Tarawneh
  • Danil Sokolov
Abstract

Stochastic circuits (SCs) offer tremendous area and power-consumption benefits at the expense of computational inaccuracies. Managing accuracy is a central problem in SC design and has no counterpart in conventional circuit synthesis. It raises a basic question: how to build a systematic design flow for stochastic circuits? We present, for the first time, a systematic design approach to control the accuracy of SCs and balance it against other design parameters. We express the (in)accuracy of a circuit processing n-bit stochastic numbers by the numerical deviation of the computed value from the expected result, in conjunction with a confidence level. Using the theory of Monte Carlo simulation, we derive expressions for the stochastic number length required for a desired level of accuracy, or vice versa. We discuss the integration of the theory into a design framework that is applicable to both combinational and sequential SCs. We show that, for combinational SCs, accuracy is independent of the circuit's size or complexity, a surprising result. We also show how the analysis can identify subtle errors in both combinational and sequential designs.

Download Paper (PDF; Only available from the DATE venue WiFi)

12:00 2.2.2 ENERGY-EFFICIENT APPROXIMATE MULTIPLIER DESIGN USING BIT SIGNIFICANCE-DRIVEN LOGIC COMPRESSION (Paper/SoftConf ID: 244)
Speaker: Issa Qiqieh, School of Electrical and Electronic Engineering, Newcastle University, GB
Authors: Issa Qiqieh, Rishad Shafik, Ghaith Tarawneh, Danil Sokolov and Alex Yakovlev, Newcastle University, GB
Abstract: Approximate arithmetic has recently emerged as a promising paradigm for many imprecision-tolerant applications. It can offer substantial reductions in circuit complexity, delay and energy consumption by relaxing accuracy requirements. In this paper, we propose a novel energy-efficient approximate multiplier design using a significance-driven logic compression (SDLC) approach. Fundamental to this approach is an algorithmic and configurable lossy compression of the partial product rows based on their progressive bit significance. This is followed by the commutative remapping of the resulting product terms to reduce the number of product rows. As such, the complexity of the multiplier in terms of logic cell counts and lengths of critical paths is drastically reduced. A number of multipliers with different bit-widths (4-bit to 128-bit) are designed in SystemVerilog and synthesized using Synopsys Design Compiler. Post-synthesis experiments showed that up to an order of magnitude energy savings, and reductions of 65% in critical delay and almost 45% in silicon area, can be achieved for a 128-bit multiplier compared to an accurate equivalent. These gains are achieved with low accuracy losses, estimated at less than 0.00071 mean relative error. Additionally, we demonstrate the energy-accuracy trade-offs for different degrees of compression, achieved through configurable logic clustering. In evaluating the effectiveness of our approach, a case study image processing application showed up to 68.3% energy reduction with negligible losses in image quality expressed as peak signal-to-noise ratio (PSNR).

12:30 2.2.3 ENERGY-EFFICIENT HYBRID STOCHASTIC-BINARY NEURAL NETWORKS FOR NEAR-SENSOR COMPUTING (Paper/SoftConf ID: 78)
Speaker: Vincent Lee, University of Washington, US
Authors: Vincent Lee (1), Armin Alaghi (1), John Hayes (2), Visvesh Sathe (1) and Luis Ceze (1)
(1) University of Washington, US; (2) University of Michigan, US
Abstract: Recent advances in neural networks (NNs) exhibit unprecedented success at transforming large, unstructured data streams into compact higher-level semantic information for tasks such as handwriting recognition, image classification, and speech recognition. Ideally, systems would employ near-sensor computation to execute these tasks at sensor endpoints to maximize data reduction and minimize data movement. However, near-sensor computing presents its own set of challenges such as operating power constraints, energy budgets, and communication bandwidth capacities. In this paper, we propose a stochastic-binary hybrid design which splits the computation between the stochastic and binary domains for near-sensor NN applications. In addition, our design uses a new stochastic adder and multiplier that are significantly more accurate than existing adders and multipliers. We also show that retraining the binary portion of the NN computation can compensate for precision losses introduced by shorter stochastic bit-streams, allowing faster run times at minimal accuracy losses. Our evaluation shows that our hybrid stochastic-binary design can achieve 9.8× energy efficiency savings, and application-level accuracies within 0.05% compared to conventional all-binary designs.

12:45 2.2.4 ACCELERATOR-FRIENDLY NEURAL-NETWORK TRAINING: LEARNING VARIATIONS AND DEFECTS IN RRAM CROSSBAR (Paper/SoftConf ID: 512)
Speaker: Li Jiang, Shanghai Jiao Tong University, CN
Authors: Lerong Chen (1), Jiawen Li (1), Yiran Chen (2), Qiuping Deng (3), Jiyuan Shen (1), Xiaoyao Liang (1) and Li Jiang (4)
(1) Shanghai Jiao Tong University, CN; (2) University of Pittsburgh, US; (3) Lynmax Research, CN; (4) Department of Computer Science and Engineering, Shanghai Jiao Tong University, CN
Abstract: An RRAM crossbar consisting of memristor devices can naturally carry out matrix-vector multiplication; it has thereby gained great momentum as a highly energy-efficient accelerator for neuromorphic computing. The resistance variations and stuck-at faults in the memristor devices, however, dramatically degrade not only the chip yield, but also the classification accuracy of the neural networks running on the RRAM crossbar. Existing hardware-based solutions cause enormous overhead and power consumption, while software-based solutions are less efficient in tolerating stuck-at faults and large variations. In this paper, we propose an accelerator-friendly neural-network training method that leverages the inherent self-healing capability of the neural network to prevent the large-weight synapses from being mapped to the abnormal memristors, based on the fault/variation distribution in the RRAM crossbar. Experimental results show the proposed method can pull the classification accuracy (10%-45% loss in previous works) up close to the ideal level with ≤ 1% loss.

13:00 IP1-1, 298 STRUCTURAL DESIGN OPTIMIZATION FOR DEEP CONVOLUTIONAL NEURAL NETWORKS USING STOCHASTIC COMPUTING
Speaker: Yanzhi Wang, Syracuse University, US
Authors: Zhe Li (1), Ao Ren (1), Ji Li (2), Qinru Qiu (1), Bo Yuan (3), Jeffrey Draper (2) and Yanzhi Wang (1)
(1) Syracuse University, US; (2) University of Southern California, US; (3) City University of New York, City College, US
Abstract: Deep Convolutional Neural Networks (DCNNs) have been demonstrated as effective models for understanding image content. The computation behind DCNNs relies heavily on the capability of hardware resources due to the deep structure. DCNNs have been implemented on different large-scale computing platforms. However, there is a trend toward embedding DCNNs into lightweight local systems, which requires low power and energy consumption and a small hardware footprint. Stochastic Computing (SC) radically simplifies the hardware implementation of arithmetic units and has the potential to satisfy the small, low-power needs of DCNNs. Local connectivities and down-sampling operations make DCNNs more complex to implement using SC. In this paper, eight feature extraction designs for DCNNs using SC, in two groups, are explored and optimized in detail from the perspective of calculation precision, where we permute two SC implementations for inner-product calculation, two down-sampling schemes, and two structures of DCNN neurons. We evaluate each DCNN, using one feature extraction design out of the eight, in terms of network accuracy and hardware performance. Through exploration and optimization, the accuracies of SC-based DCNNs are guaranteed compared with software implementations on CPU/GPU/binary-based ASIC synthesis, while area, power, and energy are significantly reduced, by up to 776×, 190×, and 32,835×, respectively.

13:01 IP1-2, 364 APPROXQA: A UNIFIED QUALITY ASSURANCE FRAMEWORK FOR APPROXIMATE COMPUTING
Speaker: Ting Wang, The Chinese University of Hong Kong, HK
Authors: Ting Wang, Qian Zhang and Qiang Xu, The Chinese University of Hong Kong, HK
Abstract: Approximate computing, being able to trade off computation quality and computational effort (e.g., energy) by exploiting the inherent error-resilience of emerging applications (e.g., recognition and mining), has garnered significant attention recently. Quality assurance is undoubtedly indispensable for a satisfactory user experience with approximate computing, but this issue has remained largely unexplored in the literature. In this work, we propose a novel framework, ApproxQA, to tackle this problem, in which approximation mode tuning and rollback recovery are considered in a unified manner when a quality violation occurs. Specifically, ApproxQA uses a two-level controller, in which the high-level approximation controller tunes approximation modes at a coarse-grained scale based on Q-learning, while the low-level rollback controller judiciously determines whether to perform rollback recovery at a fine-grained scale based on the target quality requirement. ApproxQA can provide statistical quality assurance even when the underlying quality checkers are not reliable. Experimental results on various benchmark applications demonstrate that it significantly outperforms existing solutions in terms of energy efficiency with quality assurance.

13:02 IP1-3, 241 EVOAPPROX8B: LIBRARY OF APPROXIMATE ADDERS AND MULTIPLIERS FOR CIRCUIT DESIGN AND BENCHMARKING OF APPROXIMATION METHODS
Speaker: Lukas Sekanina, Brno University of Technology, CZ
Authors: Vojtech Mrazek, Radek Hrbacek, Zdenek Vasicek and Lukas Sekanina, Brno University of Technology, CZ
Abstract: Approximate circuits and approximate circuit design methodologies have attracted significant attention from researchers as well as industry in recent years. In order to accelerate the approximate circuit and system design process and to support fair benchmarking of circuit approximation methods, we propose a library of approximate adders and multipliers called EvoApprox8b. This library contains 430 non-dominated 8-bit approximate adders created from 13 conventional adders and 471 non-dominated 8-bit approximate multipliers created from 6 conventional multipliers. These implementations were evolved by multi-objective Cartesian genetic programming. The EvoApprox8b library provides Verilog, Matlab and C models of all approximate circuits. In addition to standard circuit parameters, the error is given for seven different error metrics. The EvoApprox8b library is available at: www.fit.vutbr.cz/research/groups/ehw/approxlib

13:00 End of session
Lunch Break in Garden Foyer
Keynote Lecture session 3.0 in "Garden Foyer", 13:50-14:20

Lunch Break in the Garden Foyer
On all conference days (Tuesday to Thursday), a buffet lunch will be offered in the Garden Foyer, in front of the session rooms. Kindly note that this is restricted to conference delegates possessing a lunch voucher. When entering the lunch break area, delegates will be asked to present the lunch voucher for that day. Once a delegate has left the lunch area, re-entry is not allowed for that lunch.

Source URL: https://www.date-conference.com/date17/conference/session/2.2
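
The session's opening abstract relates stochastic number length to a numerical deviation and a confidence level. As a rough illustration only, the Python sketch below uses the textbook normal-approximation bound for a Bernoulli bit-stream, which is an assumption here and not the expressions derived in the paper. The function names (`required_length`, `encode`, `decode`, `sc_multiply`) are hypothetical, and the AND-gate multiplier is the standard unipolar stochastic multiplier rather than any design from this session.

```python
import math
import random
from statistics import NormalDist


def required_length(eps, confidence):
    """Worst-case stream length n so that |estimate - p| <= eps holds with the
    given two-sided confidence. Assumes independent bits and the normal
    approximation, using p * (1 - p) <= 1/4."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return math.ceil(z * z / (4.0 * eps * eps))


def encode(p, n, rng):
    """Unipolar stochastic number: n independent bits, each 1 with probability p."""
    return [1 if rng.random() < p else 0 for _ in range(n)]


def decode(bits):
    """Estimate the encoded value as the fraction of 1s in the stream."""
    return sum(bits) / len(bits)


def sc_multiply(x_bits, y_bits):
    """Bitwise AND of two independent streams; its mean estimates the product."""
    return [a & b for a, b in zip(x_bits, y_bits)]


rng = random.Random(0)  # fixed seed so the run is repeatable
n = required_length(eps=0.05, confidence=0.95)
x, y = 0.5, 0.8
estimate = decode(sc_multiply(encode(x, n, rng), encode(y, n, rng)))
print(f"length={n}, estimate={estimate:.3f}, error={abs(estimate - x * y):.3f}")
```

Under these assumptions, a deviation of 0.05 at 95% confidence gives a length of 385 bits. Halving the deviation roughly quadruples the length, because the bound scales with 1/eps².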

Similar resources

Estimating scour below inverted siphon structures using stochastic and soft computing approaches

This paper uses nonlinear regression, Artificial Neural Network (ANN) and Genetic Programming (GP) approaches for predicting an important practical quantity, namely the scour dimensions downstream of inverted siphon structures. Dimensional analysis and nonlinear regression-based equations were proposed for estimation of maximum scour depth, location of the scour hole, location and height of the dune downs...

A Hybrid Neural Network Approach for Kinematic Modeling of a Novel 6-UPS Parallel Human-Like Mastication Robot

Introduction: We aimed to introduce a 6-universal-prismatic-spherical (UPS) parallel mechanism for the human jaw motion and theoretically evaluate its kinematic problem. We proposed a strategy to provide a fast and accurate solution to the kinematic problem. The proposed strategy could accelerate the process of solution-finding for the direct kinematic problem by reducing the number of required ...

Wilson wavelets for solving nonlinear stochastic integral equations

A new computational method based on Wilson wavelets is proposed for solving a class of nonlinear stochastic Itô-Volterra integral equations. To do this, a new stochastic operational matrix of Itô integration for Wilson wavelets is obtained. Block pulse functions (BPFs) and the collocation method are used to generate a process for forming this matrix. Using these basis functions and their operat...

(Q,r) Stochastic Demand Inventory Model With Exact Number of Cycles

In most stochastic inventory models, such as continuous review models and periodic review models, it has been assumed that the stockout period during a cycle is small enough to be neglected so that the average number of cycles per year can be approximated as D/Q, where D is the average annual demand and Q is the order quantity. This assumption makes the problem more tractable, but it should not ...

Stochastic Approximate Scheduling by Neurodynamic Learning

The paper suggests a stochastic approximate solution to scheduling problems with unrelated parallel machines. The presented method is based on neurodynamic programming (reinforcement learning and feed-forward artificial neural networks). For various scheduling environments (static-dynamic, deterministic-stochastic) different variants of episodic Q-learning rules are proposed. A way to improve th...

Publication date: 2017